ETL4LinkedProv: Managing Multigranular Linked Data Provenance
Authors
Abstract
This article presents ETL4LinkedProv, an approach for managing the collection and publication of provenance at distinct levels of granularity as Linked Data. The approach uses ETL workflows and a component named Provenance Collector Agent to collect two kinds of provenance, prospective and retrospective, and to integrate them with domain data. The component also sets the granularity of the provenance to be captured. ETL4LinkedProv is evaluated in a real-world scenario in which Brazilian government agencies produce and publish public data sources as Linked Data. The article also measures how the generated provenance affects the runtime of the ETL workflows and the number of published RDF triples.
Similar resources
Gerência de Proveniência Multigranular em Linked Data com a Abordagem ETL4LinkedProv
This paper presents the ETL4LinkedProv approach to manage the collection and publication of provenance metadata with different levels of granularity, as Linked Data. The approach uses ETL workflows and a novel component named Provenance Collector Agent. Its application in a real scenario is presented and the impact of the fine-grained provenance in the ETL workflow runtime and in the number of ...
Exploring the Evolution and Provenance of Git Versioned RDF Data
The distributed character and the manifold possibilities for interchanging data on the Web lead to the problem of getting hold of the provenance of the data. Especially in the domain of digital humanities and when dealing with Linked Data in an enterprise context provenance information is needed to support the collaborative process of data management. We are proposing a possibility for capturin...
DEMO: Managing the Provenance of Crowdsourced Disruption Reports
Human computation systems that outsource tasks to the crowd often have to address issues associated with the quality of contributions. We are exploring the potential role of provenance to facilitate processes such as quality assessment within such systems. In this demo we present an application for managing traffic disruption reports generated by the crowd, and outline the technologies used to ...
Modelling provenance of DBpedia resources using Wikipedia contributions
DBpedia is one of the largest datasets in the Linked Open Data cloud. Its centrality and its cross-domain nature make it one of the most important and most referred to knowledge bases on the Web of Data, generally used as a reference for data interlinking. Yet, in spite of its authoritative aspect, there is no work so far tackling the provenance aspect of DBpedia statements. By being extracted...
Provenance Information in the Web of Data
The openness of the Web and the ease of combining linked data from different sources create new challenges. Systems that consume linked data must evaluate quality and trustworthiness of the data. A common approach for data quality assessment is the analysis of provenance information. For this reason, this paper discusses provenance of data on the Web and proposes a suitable provenance model. Whi...
Journal title:
- JIDM
Volume 7, Issue
Pages -
Publication date: 2016